A justification for reporting the majority-rule consensus tree in Bayesian phylogenetics.

Authors

  • Mark T Holder
  • Jeet Sukumaran
  • Paul O Lewis
Abstract

Systematists must frequently deal with substantial uncertainty in their phylogenetic estimates. Nonparametric bootstrapping (59) and Markov chain Monte Carlo (MCMC) simulations used for Bayesian phylogenetic inference (65; 64; 60) are two of the most popular computational approaches for assessing support for different parts of a phylogenetic tree. Both techniques produce large collections of trees, and a majority-rule consensus tree is often used to summarize such a collection. As has been discussed (e.g., in 50, and the ensuing debate), a consensus tree is a summary of a set of trees, and not necessarily an optimal estimator of the phylogeny. Here we present a context in which the majority-rule consensus tree of samples from the posterior probability distribution over trees can be viewed as the optimal tree to report. We explicitly rephrase phylogenetic inference as the problem of “what tree should I publish for this group of taxa, given my data?” The majority-rule consensus tree can be shown to be the optimal tree to report if we view the cost of reporting an estimate of the phylogeny as a linear function of the number of incorrect clades in the estimate and the number of true clades missing from it, and if we view reporting an incorrect grouping as a more serious error than missing a clade. The work of Berry and Gascuel (52) on reporting results from nonparametric bootstrapping overlaps significantly with the results presented here. Berry and Gascuel (52) present arguments from Bayesian decision theory, which is also the theoretical basis of our work. Berry and Gascuel focus on frequentist properties of estimators (type I and type II error rates) and interpret bootstrapping proportions as measures of the probability of a clade being present. In order to apply these decision rules to bootstrapping analyses, they study the correlation between bootstrap proportions for clades and the posterior probability of those clades.
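The linear cost described in the abstract can be illustrated with a small sketch. It is not the authors' code: the function names, the integer cost parameters `cost_incorrect` and `cost_missing`, and the representation of each sampled tree as a set of clades (each clade a `frozenset` of taxon labels) are all assumptions made for illustration. Under a per-clade linear cost, including a clade with posterior probability p costs `cost_incorrect * (1 - p)` in expectation and omitting it costs `cost_missing * p`, so a clade is worth reporting when p exceeds `cost_incorrect / (cost_incorrect + cost_missing)`. With equal costs that threshold is 1/2, which gives the majority-rule clades.

```python
from collections import Counter
from fractions import Fraction


def clade_frequencies(trees):
    """Estimate clade posterior probabilities as sample frequencies.

    `trees` is a list of sampled trees; each tree is a set of clades,
    and each clade is a frozenset of taxon labels (assumed encoding).
    """
    counts = Counter(clade for tree in trees for clade in tree)
    return {clade: Fraction(c, len(trees)) for clade, c in counts.items()}


def consensus_clades(trees, cost_incorrect=1, cost_missing=1):
    """Return the clades whose inclusion lowers the expected linear cost.

    Costs are assumed to be positive integers. Requiring
    cost_incorrect >= cost_missing keeps the threshold at or above 1/2,
    so the selected clades are mutually compatible and form a tree.
    """
    if cost_incorrect < cost_missing:
        raise ValueError("this sketch assumes cost_incorrect >= cost_missing")
    threshold = Fraction(cost_incorrect, cost_incorrect + cost_missing)
    freqs = clade_frequencies(trees)
    return {clade for clade, p in freqs.items() if p > threshold}


def expected_loss(reported, freqs, cost_incorrect=1, cost_missing=1):
    """Expected cost of reporting `reported`, given clade probabilities."""
    loss = Fraction(0)
    for clade, p in freqs.items():
        if clade in reported:
            loss += cost_incorrect * (1 - p)  # chance the clade is wrong
        else:
            loss += cost_missing * p          # chance a true clade is missed
    # Reported clades never seen in the sample have estimated probability 0.
    loss += cost_incorrect * sum(1 for c in reported if c not in freqs)
    return loss


# Three hypothetical posterior samples on taxa A-D (nontrivial clades only).
AB, ABC, CD = frozenset("AB"), frozenset("ABC"), frozenset("CD")
samples = [{AB, ABC}, {AB, ABC}, {AB, CD}]

freqs = clade_frequencies(samples)
majority = consensus_clades(samples)                    # threshold 1/2
cautious = consensus_clades(samples, cost_incorrect=3)  # threshold 3/4
```

In this toy sample, clade AB appears in every tree, ABC in two of three, and CD in one of three. Equal costs keep AB and ABC, while weighting an incorrect clade three times as heavily as a missing one keeps only AB.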

Similar resources

Summarizing a posterior distribution of trees using agreement subtrees.

Bayesian inference of phylogeny is unique among phylogenetic reconstruction methods in that it produces a posterior distribution of trees rather than a point estimate of the best tree. The most common way to summarize this distribution is to report the majority-rule consensus tree annotated with the marginal posterior probabilities of each partition. Reporting a single tree discards information...

Polynomial-Time Algorithms for Building a Consensus MUL-Tree

A multi-labeled phylogenetic tree, or MUL-tree, is a generalization of a phylogenetic tree that allows each leaf label to be used many times. MUL-trees have applications in biogeography, the study of host-parasite cospeciation, gene evolution studies, and computer science. Here, we consider the problem of inferring a consensus MUL-tree that summarizes a given set of conflicting MUL-trees, and p...

Algorithms for Building Consensus MUL-trees

A MUL-tree is a generalization of a phylogenetic tree that allows the same leaf label to be used many times. Lott et al. [9,10] recently introduced the problem of inferring a so-called consensus MUL-tree from a set of conflicting MUL-trees and gave an exponential-time algorithm for a special greedy variant. Here, we study strict and majority rule consensus MUL-trees, and present the first ever ...

Properties of Majority-Rule Supertrees

Supertree methods assemble many smaller phylogenetic trees, called input trees, into a larger phylogenetic tree, a supertree, whose taxon set is the union of the taxon sets of the input trees. This synthesis can provide a high-level perspective that is harder to attain from individual trees. A recent example of the use of this approach is the species-level phylogeny of nearly all extant Mamma...

A Linear-Time Majority Tree Algorithm

We give a randomized linear-time algorithm for computing the majority rule consensus tree. The majority rule tree is widely used for summarizing a set of phylogenetic trees, which is usually a postprocessing step in constructing a phylogeny. We are implementing the algorithm as part of an interactive visualization system for exploring distributions of trees, where speed is a serious concern for...

Journal:
  • Systematic Biology

Volume 57, Issue 5

Pages: -

Publication date: 2008